Marco Pedersoli

ETS

Distilling semantically aware orders for autoregressive image generation

Apr 23, 2025
Via arXiv

Learning from Stochastic Teacher Representations Using Student-Guided Knowledge Distillation

Apr 19, 2025
Via arXiv

Progressive Multi-Source Domain Adaptation for Personalized Facial Expression Recognition

Apr 05, 2025
Via arXiv

Neural Architecture Search by Learning a Hierarchical Search Space

Mar 27, 2025
Via arXiv

Disentangled Source-Free Personalization for Facial Expression Recognition with Neutral Target Data

Mar 26, 2025
Via arXiv

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Feb 03, 2025
Via arXiv

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Dec 05, 2024
Via arXiv

Visual Modality Prompt for Adapting Vision-Language Object Detectors

Dec 01, 2024
Via arXiv

Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation

Nov 26, 2024
Via arXiv

IntentGPT: Few-shot Intent Discovery with Large Language Models

Nov 16, 2024
Via arXiv